Frontiers in Molecular Biosciences

Frontiers Media SA

All preprints, ranked by how well they match Frontiers in Molecular Biosciences's content profile, based on 100 papers previously published here. The average preprint has a 0.14% match score for this journal, so anything above that is already an above-average fit. Older preprints may already have been published elsewhere.

1
Rare Cholesterol Related Disorders: A Sterolomic Library for Diagnosis and Monitoring of Diseases

Griffiths, W. J.; Asgari, M. A.; Yutuc, E.; Khalik, J. A.; Crick, P. J.; Morris, A. A.; Jones, S. A.; Ghosh, A.; Hart, C.; Schoels, L.; Matysik, S.; Laina, I.; Pickrell, W. O.; Moat, S. J.; Wang, Y.

2025-06-23 genetic and genomic medicine 10.1101/2025.06.23.25328695 medRxiv
Top 0.1%
19.7%

Cholesterol is an essential molecule in all animals; it can be made by de novo synthesis and can be taken up from the diet. Inherited disorders of cholesterol synthesis, metabolism and transport lead to disease, often with neurological signs. However, such disorders tend to have non-specific symptoms and can be difficult to diagnose. In addition, there is no single diagnostic test applicable to multiple disorders of cholesterol synthesis, metabolism and transport which can be used to suggest or confirm a diagnosis, resulting in a delay in treatment, particularly in the case of unknown genetic variants. Here, we present the first version of a mass spectrometry sterolomic library to aid the diagnosis of manifold cholesterol-related inherited disorders of metabolism. The library was generated using technology based on simple derivatisation chemistry exploiting the Girard P hydrazine reagent and utilising electrospray ionisation mass spectrometry in the positive and negative modes. The library includes data for 13 autosomal recessive disorders and predicted data for a further 8 disorders.

2
The intrinsically disordered protein glue of myelin: Linking AlphaFold2 predictions to experimental data

Krokengen, O. C.; Raasakka, A.; Kursula, P.

2022-09-15 biophysics 10.1101/2022.09.13.507838 medRxiv
Top 0.1%
14.6%

Numerous human proteins are either partially or fully classified as intrinsically disordered proteins (IDPs). Due to their properties, high-resolution structural information about IDPs is generally lacking. On the other hand, IDPs are known to adopt local ordered structures upon interactions with ligands, which could be e.g. other proteins or lipid membrane surfaces. While recent developments in protein structure prediction have been revolutionary, their impact on IDP research at high resolution remains limited. We took a specific example of two myelin-specific IDPs, the myelin basic protein (MBP) and the cytoplasmic domain of myelin protein zero (P0ct). Both of these IDPs are known to be crucial for normal nervous system development and function, and while they are disordered in solution, upon membrane binding, they partially fold into helices, being embedded into the lipid membrane. We carried out AlphaFold2 predictions of both proteins and analysed the models in light of previously published data related to solution structure and molecular interactions. We observe that the predicted models have helical segments that closely correspond to the characterised membrane-binding sites on both proteins. We furthermore analyse the fits of the models to SAXS data from the same IDPs. Artificial intelligence-based models of IDPs appear to be able to provide detailed information on the ligand-bound state of these proteins, instead of the form dominating free in solution. We further discuss the implications of the predictions for normal mammalian nervous system myelination and their relevance to understanding disease aspects of these IDPs.

3
Analysis of SARS-CoV-2 ORF3a structure reveals chloride binding sites

Marquez-Miranda, V.; Rojas, M.; Duarte, Y.; Diaz-Franulic, I.; Holmgren, M.; Cachau, R.; Gonzalez-Nilo, F. D.

2020-10-22 biophysics 10.1101/2020.10.22.349522 medRxiv
Top 0.1%
14.5%

SARS-CoV-2 ORF3a is believed to form ion channels, which may be involved in the modulation of virus release, and has been implicated in various cellular processes like the up-regulation of fibrinogen expression in lung epithelial cells, downregulation of type 1 interferon receptor, caspase-dependent apoptosis, and increasing IFNAR1 ubiquitination. ORF3a assembles as homotetramers, which are stabilized by residue C133. A recent cryoEM structure of a homodimeric complex of ORF3a has been released. A lower-resolution cryoEM map of the tetramer suggests two dimers form it, arranged side by side. The dimer's cryoEM structure revealed that each protomer contains three transmembrane helices arranged in a clockwise configuration, forming a six-helix transmembrane domain. This domain's potential permeation pathway has six constrictions narrowing to about 1 Å in radius, suggesting the structure solved is in a closed or inactivated state. At the cytosol end, the permeation pathway encounters a large and polar cavity formed by multiple beta strands from both protomers, which opens to the cytosolic milieu. We modeled the tetramer following the arrangement suggested by the low-resolution tetramer cryoEM map. Molecular dynamics simulations of the tetramer embedded in a membrane and solvated with 0.5 M of KCl were performed. Our simulations show the cytosolic cavity is quickly populated by both K+ and Cl-, yet with different dynamics. K+ ions moved relatively freely inside the cavity without forming proper coordination sites. In contrast, Cl- ions enter the cavity, and three of them can become stably coordinated near the intracellular entrance of the potential permeation pathway by an inter-subunit network of positively charged amino acids. Consequently, the central cavity's electrostatic potential changed from being entirely positive at the beginning of the simulation to more electronegative at the end.

4
Profiling the Expression of Transportome Genes in cancer: A systematic approach

Visentin, L.; Scarpellino, G.; Munaron, L.; Ruffinatti, F. A.

2023-07-19 bioinformatics 10.1101/2023.07.18.549498 medRxiv
Top 0.1%
14.3%

The transportome, the -omic layer encompassing all Ion Channels and Transporters (ICTs), is crucial for cell physiology. It is therefore reasonable to hypothesize a role of the transportome in disease, and in particular in cancer. Here, we present the Membrane Transport Protein DataBase (MTP-DB), a database collecting information on ICTs, and a pipeline that takes expression data and the MTP-DB as input to produce a broad overview of transportome dysregulation in cancer. The MTP-DB may prove useful for the study of the transportome in general, and the pipeline may be used to study the transportome in other diseases. Both tools are open source and can be found on GitHub at TCP-Lab/mtp-db and TCP-Lab/transportome_profiler, under permissive licenses. We detect that the transportome is dysregulated in cancer, and that dysregulation patterns are shared among different cancer types. It is still unclear how these patterns are linked to cancer patho-physiology.

5
Structural, biophysical and biological analysis and characterisation of IRF4 DNA-binding domain mutations associated with multiple myeloma

Tatum, N. J.; Scott, R.; Doody, G.; Hickson, I.; Jennings, C. E.; Martin, M. P.; Tooze, R.; Tucker, J. A.; Wittner, A.; Wang, L.-Z.; Wright, E. K.; Wedge, S. R.; Noble, M. E. M.

2025-03-14 biophysics 10.1101/2025.03.12.642038 medRxiv
Top 0.1%
12.8%

IRF4, a transcription factor in the interferon regulatory factor family, is a key regulator in immune cell differentiation indicated to have an essential role in the development of lymphoid malignancies. Genome-wide association studies previously identified a set of overlapping mutations within the IRF4 DNA-binding domain in T-cell lymphoma and multiple myeloma, several of which appeared to be associated with better prognosis. Mapping these mutations to the known crystal structure of the IRF4:PU.1:DNA ternary complex and a new structure of the IRF4 DNA-binding domain in the apo state suggested they might interfere with DNA-binding, directly or via destabilisation of domain structure. We characterised these cancer-associated IRF4 mutants experimentally using the recombinant IRF4 DNA-binding domain (DBD) in vitro and examined the clinically relevant mutant K123R in cellulo. Using fluorescence polarisation, surface plasmon resonance, differential scanning fluorimetry and molecular dynamics, we find that mutation may give rise to significant differences in DNA-binding kinetics and thermal stability without compromising the affinity of IRF4 DNA-binding. The K123R IRF4 mutant showed increased transcriptional activity via a luciferase reporter assay and increased nuclear partitioning, which may be preferentially selected for in multiple myeloma. We discuss our observations in relation to the improved prognosis conferred by this mutation.

6
Modulation of Electrostatic Interactions as a Mechanism of Cryptic Adaptation of Colwellia to High Hydrostatic Pressure

Makhatadze, G. I.

2024-07-29 biophysics 10.1101/2024.07.28.605522 medRxiv
Top 0.1%
12.6%

The role of various interactions in determining the pressure adaptation of the proteome in piezophilic organisms remains to be established. It is clear that the adaptation is not limited to one or two proteins, but involves a more general evolution of the characteristics of the entire proteome, the so-called cryptic evolution. Using the synergy between bioinformatics, computer simulations, and some experimental evidence, we probed the physico-chemical mechanisms of cryptic evolution of the proteome of psychrophilic strains of the model organism Colwellia to adapt to life at various pressures, from the surface of the Arctic ice to the depth of the Mariana Trench. From the bioinformatics analysis of proteomes of several strains of Colwellia, we have identified the modulation of interactions between charged residues as a possible driver of evolutionary adaptation to high hydrostatic pressure. The computational modeling suggests that these interactions have different roles in modulating the function-stability relationship for different protein families. For several classes of proteins, the modulation of interactions between charges evolved to lead to an increase in stability with pressure, while for others, just the opposite is observed. The latter trend appears to benefit enzyme activity by countering structural rigidification due to the high pressure.

7
Structural Assessment of the Full-Length Wild-Type Tumor Suppressor Protein p53 by Mass Spectrometry-Guided Computational Modeling

Di Ianni, A.; Tueting, C.; Kipping, M.; Ihling, C. H.; Koeppen, J. S.; Iacobucci, C.; Arlt, C.; Kastritis, P. L.; Sinz, A.

2022-11-11 biochemistry 10.1101/2022.11.11.516092 medRxiv
Top 0.1%
12.5%

The tetrameric tumor suppressor p53 represents a great challenge for 3D-structural analysis due to its high degree of intrinsic disorder (ca. 40%). We aim to shed light on the structural and functional roles of p53's C-terminal region in the full-length, wild-type human p53 tetramer and their importance for DNA binding. For this, we employed complementary techniques of structural mass spectrometry (MS) in an integrated approach with AI-based computational modeling. Our results show no major conformational differences in p53 between DNA-bound and DNA-free states, but reveal a substantial compaction of p53's C-terminal region. This supports the proposed mechanism of unspecific DNA binding to the C-terminal region of p53 prior to transcription initiation by specific DNA binding to the core domain of p53. The synergies between complementary structural MS techniques and computational modeling, as pursued in our integrative approach, are envisioned to serve as a general strategy for studying intrinsically disordered proteins (IDPs) and intrinsically disordered regions (IDRs).

8
DLST--a Cuproptosis-related Gene--is a Potential Diagnostic and Prognostic Factor for Clear Cell Renal Cell Carcinoma

Wang, H.; Ma, X.; Li, S.; Ni, X.

2023-04-28 genetic and genomic medicine 10.1101/2023.04.27.23289219 medRxiv
Top 0.1%
12.3%

Clear cell renal cell carcinoma (ccRCC) accounts for the highest number of renal malignancies and 3% of all adult cancers. The incidence of ccRCC is increasing worldwide, and its prognosis is poor. Approximately 30% of the patients are diagnosed at a late stage and are frequently asymptomatic. Cuproptosis is a new type of cell death that is regulated by Cu ions. As cuproptosis is associated with cancer development, we hypothesized that changes in the expression of cuproptosis-related genes (CRGs) are associated with the prognosis of ccRCC, and that CRGs can serve as biomarkers for the diagnosis and prognosis of ccRCC. In the present study, we explored the correlation between CRGs and ccRCC prognosis by analyzing publicly available data. We analyzed the clinical information and RNA-sequencing data in The Cancer Genome Atlas using bioinformatics tools. Dihydrolipoamide S-succinyltransferase (DLST) was identified as a novel gene with predictive and diagnostic potential. CRGs were under-expressed in ccRCC samples, and downregulation of DLST was highly associated with poor prognosis. Cox univariate and multivariate regression analyses revealed that DLST could serve as an independent prognostic factor for ccRCC. Further, functional enrichment analysis indicated that low expression of DLST may affect immune function. Our results strongly indicate that DLST plays an important role in ccRCC progression and may serve as an independent diagnostic and prognostic biomarker for ccRCC. Therefore, DLST is a potential therapeutic target for patients with ccRCC.

9
Revisiting the effects of MDR1 Variants using computational approaches

Gutman, T.; Tuller, T.

2023-09-03 genetic and genomic medicine 10.1101/2023.09.02.23294978 medRxiv
Top 0.1%
12.3%

P-glycoprotein, encoded by the MDR1 gene, is an ATP-dependent pump that exports various substances out of cells. Its overexpression is related to multidrug resistance in many cancers. Numerous studies explored the effects of MDR1 variants on P-glycoprotein expression and function, and on patient survivability. T1236C, T2677C and T3435C are prevalent MDR1 variants that are the most widely studied, typically in vitro and in vivo, with remarkably inconsistent results. In this paper we perform computational, data-driven analyses to assess the effects of these variants using a different approach. We use knowledge of gene expression regulation to elucidate the variants' mechanism of action. Results indicate that T1236C increases MDR1 levels by 2-fold and is correlated with worse patient prognosis. Additionally, examination of MDR1 folding strength suggests that T3435C potentially modifies co-translational folding. Furthermore, all three variants reside in potential translation bottlenecks and likely cause increased translation rates. These results support several hypotheses suggested by previous studies. To the best of our knowledge, this study is the first to apply a computational approach to examine the effects of MDR1 variants.

10
Convolutional neural network approach for the automated identification of in cellulo crystals

Kardoost, A.; Schönherr, R.; Deiter, C.; Redecke, L.; Lorenzen, K.; Schulz, J.; de Diego, I.

2023-03-29 biophysics 10.1101/2023.03.28.533948 medRxiv
Top 0.1%
12.3%

In cellulo crystallization is a rarely occurring event in nature. Recent advances, making use of heterologous overexpression, can promote the intracellular formation of protein crystals, but new tools are required to detect and to characterize these targets in the complex cell environment. In the present work we make use of Mask R-CNN, a Convolutional Neural Network (CNN) based instance segmentation method, for the identification of either single or multi-shaped crystals growing in living insect cells, using conventional bright field images. The algorithm can be rapidly adapted to recognize different targets, with the aim to extract relevant information to support a semi-automated screening pipeline, with the purpose to aid in the development of the intracellular protein crystallization approach.

11
Nearest Neighbour Interactions between Amino Acid Residues in Short Peptides and Coil Libraries

Schweitzer-Stenner, R.

2026-01-22 biophysics 10.64898/2026.01.19.700493 medRxiv
Top 0.1%
12.2%

Intrinsically disordered proteins (IDPs) or proteins with intrinsically disordered regions (IDRs) perform a plethora of functions, mostly in a cellular environment. As unfolded proteins, IDPs can adopt molten globule or coil ensembles of conformations. Regarding the latter, the question arises whether they are describable as a self-avoiding random coil. Locally, this requires that amino acid residues sample the entire sterically allowed region of the Ramachandran plot with very similar probabilities and independently of the conformational dynamics of their neighbours. However, various lines of experimental and bioinformatic evidence suggest a more restricted, side chain and nearest neighbour dependent conformational space for individual residues. Over the last 25 years short peptides and coil libraries were employed to determine conformational propensities of amino acid residues in unfolded states. The question arises whether conformational ensembles obtained from these two sources are comparable. In this paper, a variety of metrics were used to compare Ramachandran plots of a limited number of GXYG peptides (X,Y: guest residues) with XY dimers in the coil library of Ting et al. (PLOS 6, e1000763, 2010). The results reveal major differences between corresponding plots, which might in part be due to the fact that solely the influence of one of the two neighbours of a given residue is probed by the above coil library while averages were taken over the respective opposite neighbours. The presented results suggest that coil libraries alone might not be a sufficient tool for determining the characteristics of statistical coils of IDPs and IDRs alike.

12
Nucleoid associated proteins and their effect on E. coli chromosome

Gupta, A.; Abdul, W.; Mondal, J.

2020-11-05 biophysics 10.1101/2020.11.05.369934 medRxiv
Top 0.1%
10.8%

A seemingly random and disorganized bacterial chromosome is, in reality, a well organized nucleus-like structure, called the nucleoid, which is maintained by several nucleoid associated proteins (NAPs). Here we present an application of a previously developed Hi-C based computational method to study the effects of some of these proteins on the E. coli chromosome. Simulations with encoded Hi-C data for mutant, hupAB deficient, E. coli cells revealed a decondensed, axially expanded chromosome with enhanced short range and diminished long range interactions. Simulations for mutant cells deficient in FIS protein revealed that the effects are similar to those of the hupAB mutant, but the absence of FIS led to a greater disruption in chromosome organization. Absence of another NAP, MatP, known to mediate Ter macrodomain isolation, led to enhanced contacts between Ter and its flanking macrodomains but lacked any change in matS sites localization. Deficiency of MukBEF, the only SMC complex present in E. coli, led to disorganization of macrodomains. Upon further analysis, it was observed that the above mutations do not significantly impact the local chromosome organization (~ 100 Kb) but only affect the chromosome on a larger scale (>100 Kb). These observations shed more light on the sparsely explored effects of NAPs on the overall chromosome organization and help us understand the myriad complex interactions NAPs have with the chromosome.

13
Diffusive dynamics of Aspartate α-decarboxylase (ADC) liganded with D-serine in aqueous solution

Raskar, T.; Niebling, S.; Devos, J. M.; Yorke, B. A.; Härtlein, M.; Huse, N.; Forsyth, T. V.; Seydel, T.; Pearson, A. R.

2020-08-12 biophysics 10.1101/2020.08.11.244939 medRxiv
Top 0.1%
10.6%

Incoherent neutron spectroscopy, in combination with dynamic light scattering, was used to investigate the effect of ligand binding on the center-of-mass self-diffusion and internal diffusive dynamics of E. coli aspartate α-decarboxylase (ADC). The X-ray crystal structure of the D-serine inhibitor complex with ADC was also determined, and molecular dynamics simulations used to further probe the structural rearrangements that occur as a result of ligand binding. These experiments reveal the existence of higher order oligomers of the ADC tetramer on ns-ms time-scales, and also show that ligand binding both affects the ADC internal diffusive dynamics and appears to further increase the size of the higher order oligomers.

14
Estimation of the global burden of autosomal recessive rare inborn errors of metabolism

Mondal, S.; Dutta, A. K.; Goswami, K.

2025-02-18 genetic and genomic medicine 10.1101/2025.02.14.25322285 medRxiv
Top 0.1%
10.3%

While many Rare Inborn Errors of Metabolism (IEMs) are treatable conditions, their optimal diagnosis and treatment are a challenge for nations with low resources. Moreover, the population prevalence of these conditions is largely unknown. The availability of large genomic datasets brings the opportunity to estimate the population carrier frequency of autosomal recessive IEMs. This would help to generate disease burden statistics for better allocation of resources. In the current work, we estimated the gene-specific combined minor allele frequency of pathogenic variants from the gnomAD dataset for 235 genes associated with IEM phenotypes in OMIM. As per our estimation, almost one third of the global population is a carrier for a pathogenic variant responsible for a rare autosomal recessive inborn error of metabolism (ARIEM), with the highest carrier frequency in Ashkenazi Jews. Globally per thousand live births approximately five children are born with an ARIEM. European Finnish have the highest burden of nine out of 10,000 live births. With 25 million live births per year, India is expected to have at least 8,025 newborns with an ARIEM. Since many of these diseases are treatable, early newborn screening holds the key to ensuring optimal management of these children.
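Burden estimates of this kind typically rest on Hardy-Weinberg reasoning: for a recessive disease allele with combined frequency q, carriers occur at about 2q(1 - q) and affected individuals at about q². The abstract does not spell out its calculation, so the following is only a minimal sketch under that assumption. The function names, gene labels and allele frequencies are hypothetical placeholders, not values from the study. The last lines simply convert the India figures quoted above into a rate per 10,000 live births.

```python
def hardy_weinberg(q):
    """Return (carrier_frequency, affected_frequency) for a recessive
    disease allele with combined pathogenic allele frequency q,
    assuming Hardy-Weinberg equilibrium."""
    if not 0.0 <= q <= 1.0:
        raise ValueError("allele frequency must lie in [0, 1]")
    p = 1.0 - q
    return 2.0 * p * q, q * q


def expected_affected_births(gene_allele_freqs, live_births):
    """Sum per-gene q**2 incidences (treated as independent and rare)
    and scale by the number of live births."""
    incidence = sum(hardy_weinberg(q)[1] for q in gene_allele_freqs.values())
    return incidence * live_births


# Hypothetical per-gene combined pathogenic allele frequencies (illustration only).
example_freqs = {"GENE_A": 0.01, "GENE_B": 0.005}

carrier, affected = hardy_weinberg(example_freqs["GENE_A"])
births = expected_affected_births(example_freqs, 1_000_000)

# Rate implied by the figures quoted in the abstract for India:
# 8,025 affected newborns out of 25 million live births.
india_rate_per_10k = 8025 / 25_000_000 * 10_000
```

Summing q² across genes treats each disorder as rare and independent, which is a simplification; a real analysis would also need to handle population structure and variant curation.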

15
Unique proteome signatures in ICU patients with COVID-19 and delirium: an observational study

Edel, A.; Sreekanth, J.; Kurth, F.; Ralser, M.; Demichev, V.; Muelleder, M.; Blanc, E.; Spies, C.

2024-12-12 intensive care and critical care medicine 10.1101/2024.12.09.24317145 medRxiv
Top 0.1%
10.2%

Background: Delirium is common in COVID-19 intensive care unit (ICU) patients. Biomarkers for prediction, detection, and monitoring are missing. Unbiased omics analyses are warranted to gain a systems biology view on pathophysiology. Methods: This prospective observational satellite study aims to investigate the proteome signatures of COVID-19 ICU patients, comparing those with delirium to those without. This study was conducted in ICUs of a university hospital between March 2020 and September 2021. ICU patients of legal age with a positive SARS-CoV-2 test were screened daily for oversedation and delirium. Blood samples were taken thrice a week. 457 samples were analyzed using data-independent acquisition mass spectrometry to determine protein levels. A mixed-effects regression model was developed to identify proteins significantly influenced by delirium, accounting for sex and age as confounders. This model also aimed to determine proteins that were either up- or downregulated in association with delirium. Additionally, an enrichment analysis was conducted to examine the biological pathways linked to these delirium-associated proteins. Results: Out of 360 ICU patients, 69 were analyzed for protein profiling. Out of these 69 patients, 42 patients (60.9%) had delirium on ICU admission, and 27 (39.1%) did not. Based on the multivariate model, the analysis of 204 proteins revealed 125 (61.3%) to be differentially expressed. In total, 80.8% (n=101) of these 125 proteins were associated with delirium. Of these, 10 proteins were uniquely associated with delirium and were not significant in the multivariate model (SERPING1, SERPINA7, HP, TGFBI, CD5L, IGHV3-7, IGHV1-46, IGHV3-15, IGHV3-23, and "IGHV4-34;IGHV4-38-2"). In the univariate model for delirium, six out of 111 significant proteins showed increased expression with a log2FC > 0.5: PIGR, MST1, LBP, CRP, SAA1, and "SAA1;SAA2"; while three showed decreased expression with a log2FC < -0.5: HP, PPBP, and "HP;HPR". The enrichment analysis of delirium-influenced proteins revealed three significant pathways: "Network map of SARS-CoV-2 signaling" (M42569/WP5115), "Acute inflammatory response" (M10617), and "Regulation of defense response" (M15277). Conclusion: We identified a unique proteomic signature in COVID-19 ICU patients with delirium, including up- and downregulated proteins. These findings may provide potential biomarker candidates for the assessment of delirium risk and its underlying causes. These findings could be a further step towards the development of personalized, causative treatments for delirium and its monitoring in the ICU. Trial registration: The study was retrospectively registered in the German Clinical Trials Register on May 13, 2020 (DRKS00021688).

16
Lysenin toxin insertion mechanism is Calcium-dependent

Munguira, I. L. B.

2019-09-18 biophysics 10.1101/771725 medRxiv
Top 0.1%
10.2%

Pore Forming Toxins (PFTs), formed mainly by virulence factors of bacteria, belong to the Pore Forming Protein (PFP) family. Secreted as soluble monomers, they bind specific targets in membranes, where their oligomerization and insertion take place. Lysenin, a member of the PFTs, forms an oligomer after sphingomyelin binding, the so-called prepore, which becomes inserted, forming a pore, after a conformational change triggered by a pH decrease. In crowded conditions, oligomers tend to stay in prepore form because the prepore-to-pore transition is sterically blocked. In this study, we investigate the effect of calcium ions in those crowded conditions, finding that calcium acts as a trigger for lysenin insertion. We localize the residues responsible for calcium sensitivity in a small α-helix. Our results are not only one of the few complete structural descriptions of prepore-to-pore transitions but the very first that involves a calcium triggering mechanism. The presence of glutamic or aspartic acids in the insertion domains could be an indication that calcium may be a general trigger for PFTs and more generally PFPs.

17
SSI: A Statistical Sensitivity Index for Chemical Reaction Networks in cancer

Biddau, G.; Caviglia, G.; Piana, M.; Sommariva, S.

2023-01-15 systems biology 10.1101/2023.01.12.523784 medRxiv
Top 0.1%
10.1%

Summary: At the cellular level, cancer is triggered by mutations of the proteins involved in signalling networks made of hundreds of reacting species. The corresponding mathematical model consists of a large system of non-linear Ordinary Differential Equations for the unknown protein concentrations, depending on a consistently large number of kinetic parameters and initial concentrations. For this model, the present paper considers the problem of assessing the impact of each parameter and initial concentration on the system's output. More specifically, we introduced a statistical sensitivity index whose values can be easily computed by means of principal component analysis, and which leads to the partition of the parameters and initial concentrations sets into sensible and non-sensible families. This approach allows the identification of those kinetic parameters and initial concentrations that mostly impact the mutation-driven modification of the proteomic profile at equilibrium, and of those pathways in the network that are mostly affected by the presence of mutations in the cancer cell.

18
Machine Learning Approach to Integrate and Analyse Multiomics data to Identify Actionable Biomarkers for Head and Neck Squamous Cell Carcinoma (HNSCC)

Panchal, K.; Arockia Rajesh Packiam, K.; MAJUMDAR, S.

2025-10-13 genetic and genomic medicine 10.1101/2025.10.09.25335922 medRxiv
Top 0.1%
10.1%

Head and neck squamous cell carcinoma (HNSCC) is ranked sixth among all the common cancers worldwide and is a major cause of death. A molecular understanding of disease progression can aid in timely diagnosis and therapy. This study aims to identify potential HNSCC biomarkers using a machine learning-based approach to integrate and analyse multi-omics data (namely publicly available Human Papillomavirus (HPV) negative patients' multiomics datasets from the CPTAC-HNSCC project, including transcriptomics, methylomics, proteomics, and phosphoproteomics). A three-step feature selection method was utilized to identify potential molecular biomarkers using machine learning algorithms. The top 1000 important features (genes) were filtered using Mutual Information, followed by a random forest-based feature importance ranking, and Recursive Feature Elimination with cross-validation coupled with Support Vector Machine (SVM-RFECV) to get a minimal gene set important for the machine learning based tumor-normal classification task. To benchmark these top-selected features, Logistic Regression (LogR), Random Forest (RF), Multi-layer perceptron (MLP), and Support Vector Machines (SVC) were used. The prediction performance of classifiers trained on these selected gene sets was evaluated using the accuracy metric, which was then compared against that of models trained on randomly selected gene sets. The entire workflow was repeated 100 times for different random states to establish statistical confidence in the pipeline and the selected gene set. Our integrative approach identified both omics-specific and cross-omics candidate genes with very high classification accuracy, ranging from ~95% to 100%. These genes reveal convergent biological processes central to HNSCC pathogenesis, which reinforces the robustness of the methodology used, which can be adopted to analyse similar multiomics datasets for other pathologies and foundational biological questions.

19
Pathway centered analysis to guide clinical decision-making in precision medicine

Carvalho, L. B.; Capelo, J. L.; Lodeiro, C.; Dhir, R.; Campos Pinheiro, L.; Medeiros, M.; Santos, H. M.

2021-04-30 biochemistry 10.1101/2021.04.30.442131 medRxiv
Top 0.1%
10.1%

Changes in the human proteome caused by disease before, during and after medical care are phenotype-dependent, so the proteome of each individual at any time point is a snapshot of the body's response to disease and to disease treatment. Here, we introduce a new concept named differential Personal Pathway index (dPPi). This tool extracts and summates comprehensive disease-specific information contained within an individual's proteome as a holistic way to follow the response to disease and medical care over time. We demonstrate the principle of the dPPi algorithm on proteins found in urine from patients suffering from neoplasia of the bladder. The relevance of the dPPi results to the individual clinical cases is described. The dPPi concept can be extended to other malignant and non-malignant diseases, and to other types of biopsies, such as plasma, serum or saliva. We envision the dPPi as a tool for clinical decision-making in precision medicine.

20
Ion Mobility Mass Spectrometry Guided Modeling with AlphaFold and Rosetta Improves Protein Complex Structure Prediction

Narayanasamy, A.; Drake, Z. C.; Turzo, S. M. B. A.; Rolland, A. D.; Prell, J. S.; Wysocki, V. H.; Lindert, S.

2026-02-16 biophysics 10.64898/2026.02.16.706193 medRxiv
Top 0.1%
10.1%

Ion mobility mass spectrometry (IM-MS) provides valuable structural information about protein shape and size through collision cross section (CCS). However, it lacks atomic level structural detail. While AlphaFold has been successful in predicting monomeric protein structure, it can struggle with modeling protein complexes. To address these limitations, we developed a method that integrates IM-MS data with AlphaFold and Rosetta to improve complex structure prediction. Our approach uses experimental CCS data to guide the assembly of AlphaFold predicted subunits using a Rosetta docking pipeline and evaluating the resulting complexes with a newly developed score. Using this strategy, we were able to improve root mean square deviation (RMSD) values for 26 of 38 (68%) complexes compared to AlphaFold-Multimer. Furthermore, 16 of these systems improved significantly from greater than 4 Å RMSD to less than 4 Å. This method demonstrates a robust approach to overcome limitations in complex assembly modeling. Table of Contents Graphic: In this integrative modeling work, protein complex structures were modeled by combining AlphaFold predicted subunits with Rosetta docking. Collision cross section data from ion-mobility mass spectrometry were used as evaluation constraints and docked models were scored using the IM-complex score. The best scoring models generally represent accurate protein complex structures.
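For readers unfamiliar with the RMSD metric quoted above, the sketch below shows how it is commonly computed for two structures that have already been superposed and whose atoms are listed in matching order. It is an illustration only and is not taken from the authors' pipeline: the `rmsd` helper and the toy coordinates are hypothetical, and the optimal superposition step that real comparisons need is deliberately left out. The final lines reproduce the 26-of-38 percentage from the abstract.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean square deviation between two equally sized lists of
    (x, y, z) tuples, assumed to be already superposed and atom-matched."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


# Toy coordinates in angstroms (hypothetical): the model is shifted 3 Å along z.
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
model = [(0.0, 0.0, 3.0), (1.0, 0.0, 3.0)]
toy_rmsd = rmsd(reference, model)

# Share of complexes reported as improved in the abstract: 26 of 38.
percent_improved = round(26 / 38 * 100)
```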